TOPIC ISLANDS - a wavelet-based text visualization system
Abstract
We present a novel approach to visualizing and exploring unstructured text. The underlying technology, called TOPIC-O-GRAPHY, applies wavelet transforms to a custom digital signal constructed from the words within a document. The resulting multiresolution wavelet energy is used to analyze characteristics of the narrative flow, such as theme changes, in the frequency domain; these characteristics are then related to the overall thematic content of the document using statistical methods. The thematic characteristics of a document can be analyzed at varying degrees of detail, ranging from section-sized text partitions to partitions consisting of a few words. Using this technology, we are developing a visualization system prototype known as TOPIC ISLANDS to browse a document, generate fuzzy document outlines, summarize text by level of detail and according to user interests, define meaningful subdocuments, query text content, and provide summaries of topic evolution.
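As a rough illustration of the idea in the abstract, the sketch below builds a signal from the words of a text and computes per-level wavelet energy. It is not the TOPIC-O-GRAPHY method: the abstract does not say how the signal is constructed or which wavelet is used, so the rarity-based `word_signal` and the choice of the Haar wavelet are assumptions made here for simplicity, and all function names are hypothetical.

```python
import math
from collections import Counter


def word_signal(words):
    """Hypothetical signal: one sample per word, rarer words score higher.

    This is an assumed construction, not the one used by the paper.
    """
    counts = Counter(words)
    total = len(words)
    return [-math.log(counts[w] / total) for w in words]


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def multires_energy(signal):
    """Squared detail coefficients per level, finest level first."""
    # Zero-pad to a power of two so every level halves cleanly.
    n = 1
    while n < len(signal):
        n *= 2
    current = list(signal) + [0.0] * (n - len(signal))
    levels = []
    while len(current) > 1:
        current, detail = haar_step(current)
        levels.append([d * d for d in detail])
    return levels


if __name__ == "__main__":
    text = "wavelet wavelet signal signal island island topic topic"
    for depth, energy in enumerate(multires_energy(word_signal(text.split())), 1):
        print(depth, [round(e, 3) for e in energy])
```

In this sketch, fine levels respond to word-to-word variation, while coarse levels respond to shifts between larger partitions of the text, which is one plausible reading of how a theme change could appear as wavelet energy at a coarser resolution.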
Similar resources
A Fast Localization and Feature Extraction Method Based on Wavelet Transform in Iris Recognition
With an increasing emphasis on security, automated personal identification based on biometrics has been receiving extensive attention. Iris recognition, as an emerging biometric recognition approach, is becoming a very active topic in both research and practical applications. In general, a typical iris recognition system includes iris imaging, iris liveness detection, and recognition. This rese...
Real-time Visualization of Streaming Text with Force-Based Dynamic System
An interactive visualization system, STREAMIT, enables users to explore text streams on-the-fly without prior knowledge of the data. It incorporates incoming documents from a continuous source into existing visualization context with automatic grouping and separation based on document similarities. STREAMIT supports interactive exploration with good scalability: First, keyword importance is adj...
A review of text mining approaches and their function in discovering and extracting a topic
Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
Twitter event detection: combining wavelet analysis and topic inference summarization
Today streaming text mining plays an important role within real-time social media mining. Given the amount and cadence of the data generated by those platforms, classical text mining techniques are not suitable to deal with such new mining challenges. Event detection is no exception, available algorithms rely on text mining techniques applied to pre-known datasets processed with no restrictions...
CroVeWA: Crosslingual Vector-Based Writing Assistance
We present an interactive web-based writing assistance system that is based on recent advances in crosslingual compositional distributed semantics. Given queries in Japanese or English, our system can retrieve semantically related sentences from high quality English corpora. By employing crosslingually constrained vector space models to represent phrases, our system naturally sidesteps several ...
Publication date: 1998